Briefings in Bioinformatics
Oxford University Press (OUP)
Preprints posted in the last 90 days, ranked by how well they match Briefings in Bioinformatics's content profile, based on 326 papers previously published here. The average preprint has a 0.25% match score for this journal, so anything above that is already an above-average fit.
Cui, T.; Wang, Z.; Wang, T.
AI-based molecular dynamics simulation brings ab initio calculations to biomolecules efficiently, with the machine learning force field (MLFF) playing the central role by accurately predicting molecular energies and forces. Most existing MLFFs assume localized interatomic interactions, limiting their ability to accurately model non-local interactions, which are crucial in biomolecular dynamics. In this study, we introduce ViSNet-PIMA, which efficiently learns non-local interactions through a physics-informed multipole aggregator (PIMA) and accurately encodes molecular geometric information. ViSNet-PIMA outperforms all state-of-the-art MLFFs for energy and force predictions across different kinds of biomolecules and various conformations on the MD22 and AIMD-Chig datasets, while adapting the PIMA blocks into other MLFFs further achieves 55.1% performance gains, demonstrating the superiority of ViSNet-PIMA and the universality of the model design. Furthermore, we propose AI2BMD-PIMA, which incorporates ViSNet-PIMA into the AI2BMD simulation program by introducing a "Transfer Learning-Pretraining-Finetuning" scheme and replacing molecular mechanics-based non-local calculations among protein fragments with ViSNet-PIMA; this reduces AI2BMD's energy and force calculation errors by more than 50% for different protein conformations and for protein folding and unfolding processes. ViSNet-PIMA advances ab initio calculation for entire biomolecules, amplifying the application value of AI-based molecular dynamics simulations and property calculations in biochemical research.
Kaira, V. S.; Kudari, Z. D.; P, S. S.; Bhat, R.; G, J.
Drug-target interaction prediction is significant in the hit identification phase of drug discovery, enabling the identification of potential drug candidates for downstream optimization. Traditional computational methods are limited in their ability to represent 3D structural data for both molecules and target proteins, which is required to capture the intricate protein-ligand interactions that regulate binding affinity. In this work, we propose a graph transformer-based model (GTStrDTI) that combines an intragraph attention mechanism with cross-modal attention to enrich the representation of both the drug molecule and the target protein. This approach comprehensively models both intramolecular structural features and intermolecular interactions, thereby enhancing binding affinity prediction performance. A thorough evaluation on benchmark datasets such as KIBA, DAVIS, and BindingDB_Kd shows that our approach surpasses state-of-the-art methods under challenging target cold-start settings. Our analysis found that adding a graph-based 3D structural representation of the protein target (C-alpha contact graphs from PDB with a threshold distance of 5 Å) and incorporating molecule adjacency information boosts predictive performance, thus helping to narrow the gap between computational and experimental research.
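To make the contact-graph construction mentioned above concrete, here is a minimal sketch of how a C-alpha contact graph with a 5 Å cutoff could be built. It is not the authors' implementation: the function name `contact_graph`, the inclusive `<=` comparison, and the toy coordinates are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def contact_graph(ca_coords, threshold=5.0):
    """Return edges (i, j) for residue pairs whose C-alpha atoms lie
    within `threshold` angstroms of each other.

    Assumption: the cutoff is inclusive; the abstract does not say.
    """
    edges = []
    for (i, a), (j, b) in combinations(enumerate(ca_coords), 2):
        if dist(a, b) <= threshold:
            edges.append((i, j))
    return edges


# Toy C-alpha coordinates in angstroms (hypothetical, not from a PDB entry).
toy_ca = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (3.8, 3.0, 0.0),
]
edges = contact_graph(toy_ca)
print(edges)
```

In this toy example residues 0 and 2 are 7.6 Å apart, so they are the only pair left unconnected.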
Haque, N.; Mazed, A.; Ankhi, J. N.; Uddin, M. J.
Accurate classification of SARS-CoV-2 genomic variants is essential for effective genomic surveillance, yet it is challenged by extreme class imbalance, limited representation of rare variants, and distribution shifts in real-world sequencing data. In this study, we employed a hybrid RF-SVM framework designed for robust detection of rare SARS-CoV-2 variants. It integrates a random forest and a polynomial-kernel support vector machine to enhance sensitivity to minority classes while maintaining overall predictive stability. We systematically compared classical machine learning models, deep learning approaches, and hybrid strategies under both standard and distribution-shifted evaluation settings. Our results show that classical models using TF-IDF-based k-mer features outperform deep learning methods on macro-averaged performance metrics. The random forest classifier using TF-IDF features achieved the best overall performance, with a macro-averaged F1-score of 0.8894 and an accuracy of 96.3%. The model also demonstrated strong generalization ability, as evidenced by stable cross-validation performance (CV accuracy = 0.9637). The hybrid RF-SVM model further improves rare variant detection under severe class imbalance. Calibration analysis indicates reliable probability estimates for common variants, although challenges persist for minority classes. Overall, this study highlights the limitations of deep learning in highly imbalanced genomic settings and demonstrates that carefully designed hybrid machine learning approaches provide an effective and interpretable solution for rare SARS-CoV-2 variant detection.
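As an illustration of what TF-IDF-based k-mer features can look like, the sketch below treats each sequence as a "document" of overlapping k-mers and applies the textbook weighting tf × log(N / df). The abstract does not state the k-mer length or the exact TF-IDF variant, so `k=3`, the unsmoothed formula, and the toy sequences are assumptions. Library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed, normalized variant by default and would give different numbers.

```python
from collections import Counter
from math import log


def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf(seqs, k=3):
    """Textbook TF-IDF over k-mers: (count / total) * log(N / df).

    Assumption: unsmoothed IDF; k-mers present in every sequence get weight 0.
    """
    docs = [Counter(kmers(s, k)) for s in seqs]
    n_docs = len(docs)
    # Document frequency: number of sequences containing each k-mer.
    df = Counter(kmer for doc in docs for kmer in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append(
            {kmer: (count / total) * log(n_docs / df[kmer])
             for kmer, count in doc.items()}
        )
    return vectors


# Toy sequences (hypothetical, far shorter than a real genome).
vectors = tfidf(["ACGTAC", "ACGTTT"])
print(vectors[0])
```

The resulting sparse dictionaries could then be turned into a fixed-length feature matrix and passed to any classifier, such as a random forest.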
Liang, L.; Zhao, K.
Accurate quality assessment of predicted protein-protein complex structures remains a major challenge. Existing graph-based quality assessment methods often treat the entire complex as a homogeneous graph, which obscures the physical distinction between intra-chain folding stability and inter-chain binding specificity. In this study, we introduce TriGraphQA, a novel triple graph learning framework designed for model quality assessment of protein complexes. TriGraphQA explicitly decouples monomeric and interfacial representations by constructing three geometric views: two residue-node graphs capturing the local folding environments of individual chains, and a dedicated contact-node graph representing the binding interface. Crucially, we propose an interface context aggregation module to project context-rich embeddings from the monomers onto the interface, effectively fusing multi-scale structural features. We conducted comprehensive tests on several challenging benchmark datasets, including Dimer50, DBM55-AF2, and HAF2. The results show that TriGraphQA significantly outperforms state-of-the-art single-model methods. TriGraphQA consistently achieves the highest global scoring correlations and lower top-ranking losses. Consequently, TriGraphQA provides a powerful evaluation tool for protein-protein docking, facilitating the reliable identification of near-native assemblies in large-scale structural modeling and molecular recognition studies.
Popov, N. S.; Panova, V. V.; Molchanova, M.; Gurov, S.; Lukashev, A. N.; Manolov, A.; Ilina, E. N.
The emergence of unidentified pathogens, or "Disease X," poses a significant threat to global health, necessitating the development of proactive surveillance strategies for the wildlife and human virosphere. Since novel viruses often lack universal genetic markers or known homologs, this study evaluates four reference-independent computational pipelines (coverage-based, k-mer-based, nucleotide clustering, and Large Language Model (LLM)-based) designed to detect spreading organisms by comparing distinct metagenomic datasets. Using a real-world pandemic dataset of human nasopharyngeal RNA-seq runs and a semi-synthetic dataset enriched with divergent Egovirales sequences, we measured the sensitivity, selectivity, and computational efficiency of each approach. The coverage-based method proved most robust, consistently achieving 100% genome coverage of SARS-CoV-2 and maintaining high selectivity even at low viral concentrations, though it required extensive computational resources (20 days of CPU time for 2B reads). In contrast, the k-mer-based approach offered a tenfold reduction in execution time and high selectivity but was sensitive to data depletion, failing to detect targets at very low abundances. The clustering-based pipeline performed effectively at moderate concentrations but suffered from sequence fragmentation in sparse data, while the LLM-based method (using ViraLM), despite its efficiency, exhibited critically low selectivity due to current latent space partitioning limitations. These results demonstrate that while k-mer and LLM-based tools provide rapid screening capabilities, the coverage-based approach remains the most reliable for sensitive pathogen discovery. Ultimately, these reference-independent workflows are essential for illuminating metagenomic "dark matter" and establishing early warning systems for emerging infectious diseases.
Shi, W.; Shen, C.; Liu, Y.; Xiao, Q.; Luo, J.
Motivation: Spatial transcriptomics enables gene expression profiling within intact tissue sections, providing an important basis for analyzing tissue organization, cellular heterogeneity, and microenvironmental interactions. However, existing spatial structure identification methods often integrate spatial information using fixed neighborhoods or predefined smoothing scales, which limits their ability to adapt to region-specific structural heterogeneity. In homogeneous regions, broader spatial smoothing can help preserve continuous tissue structures, whereas in regions with complex boundaries or mixed cell populations, excessive smoothing may obscure local expression differences and fine-scale structural changes. Therefore, it is necessary to develop an adaptive graph learning framework that can adjust the range of spatial information integration according to tissue structural heterogeneity. Results: In this study, we propose HAST, a heterogeneity-driven adaptive-scale graph learning framework for spatial transcriptomics. HAST adaptively determines graph filtering scales according to spatial structural heterogeneity, enabling flexible information aggregation across different tissue regions. It further decomposes gene expression signals into low-frequency structural components and high-frequency residual components, thereby jointly modeling global spatial continuity and local expression variations. Experiments on high-resolution spatial transcriptomics datasets show that HAST improves spatial structure identification and cross-section generalization. Tumor-enriched cluster identification and neighborhood enrichment analysis further demonstrate its ability to characterize tumor-associated spatial regions and microenvironmental organization.
Zhang, H.; Zheng, G.; Xu, Z.; Zhao, H.; Cai, S.; Huang, Y.; Zhou, Z.; Wei, Y.
Missense variants are a common type of genetic mutation that can alter the structure and function of proteins, thereby affecting the normal physiological processes of organisms. Accurately distinguishing damaging missense variants from benign ones is of great significance for clinical genetic diagnosis, treatment strategy development, and protein engineering. Here, we propose the VarDCL method, which integrates multimodal protein language model embeddings and self-distilled contrastive learning to identify subtle sequence and structural differences before and after protein mutations, thereby accurately predicting pathogenic missense variants. First, leveraging sequence and structural information before and after mutations, VarDCL generates sequence-structural multimodal features via different language models. It incorporates both global and local perspectives of feature embeddings to provide the model with dynamic, multimodal, and multi-view input data. Additionally, a Self-distilled Contrastive Learning (SDCL) module is proposed to enable more effective information integration and feature learning, enhancing the model's ability to detect sequence and structural changes induced by mutations. Within this module, the multi-level contrastive learning framework captures information differences before and after mutations within the same modality; meanwhile, the feature self-distillation mechanism uses high-level fused features to guide the learning of low-level differential features, facilitating information interaction across different modalities. The VarDCL framework not only ensures the model's capacity to learn dynamic changes pre- and post-mutation but also significantly improves cross-modal information interaction between sequence and structure, thereby boosting the model's performance in distinguishing pathogenic mutations from benign ones.
To validate the effectiveness of VarDCL, extensive experiments were conducted. The ablation study demonstrates that all key components of VarDCL contribute significantly. On an independent test set containing 18,731 clinical variants, VarDCL achieved an AUC of 0.917, an AUPR of 0.876, an MCC of 0.690, and an F1-score of 0.789, outperforming 21 state-of-the-art existing methods. Benchmark analysis shows that VarDCL can be utilized as an accurate and potent tool for predicting missense variant effects.
Chen, Y.; Giuliano, V.; Dacillo, I.; Lin, W.; Yan, Y.; Luo, P.
Accurate prioritization of T-cell receptor (TCR)-epitope interactions and identification of tumor-reactive T cells are important but difficult steps in immunotherapy-oriented bioinformatics workflows. Existing methods typically address these tasks separately and either model TCR-epitope pairs as independent observations or rely primarily on transcriptomic signatures. In this study, we present TRACE (TCR-epitope pRioritization And T-Cell idEntification), a graph-based computational workflow that unifies both applications within a single heterogeneous graph framework. The protocol represents TCRs, epitopes, and T cells as typed nodes connected by similarity and association edges, and combines pretrained sequence embeddings with edge-aware graph attention, Laplacian positional encoding, and bidirectional cross-domain attention. Applied to the IEDB and VDJdb benchmarks, TRACE achieved AUROC/AUPR values of 0.937/0.922 and 0.992/0.990, respectively, outperforming five state-of-the-art algorithms. In addition, on a single-cell RNA-seq dataset, the workflow achieved an AUROC of 0.984 and an AUPR of 0.984, substantially exceeding transcriptomic signature-based baselines for tumor-reactive T-cell identification. Ablation analysis showed that Laplacian positional encoding provided the largest performance gain, particularly in sparse graph settings. These results suggest that heterogeneous graph modeling can serve as a practical protocol for integrating receptor sequence, antigen context, and cellular phenotype in computational immunology.
Wang, Q.; Shi, X.
Accurate prediction of drug synergy is paramount for developing effective combination therapies and advancing personalized medicine. Although methods based on graph neural networks (GNNs) have become a prevalent approach, they often treat molecules as flat graphs of connected atoms, thus overlooking their inherent hierarchical structure (i.e., atoms forming functional groups) and the critical topological information that governs molecular interactions. To address this limitation, we introduce TopoFuseNet, a novel hierarchical graph representation learning framework that integrates multi-scale topological features. The core innovations of TopoFuseNet include: 1) the first-ever application of "Group Centrality" from network science to cheminformatics, enabling the identification and quantification of functional groups crucial to drug activity; 2) a systematic, multi-path strategy to seamlessly integrate node-level (atom) and group-level (functional group) topological features into a Graph Attention Network (GAT) via feature augmentation, attention biasing, and hierarchical pooling; 3) a Differential Transformer module to deeply fuse multi-modal features learned from sequences, fingerprints, and our proposed hierarchical graph representations. Extensive experiments on two large-scale benchmark datasets, DrugComb and DrugCombDB, demonstrate that TopoFuseNet significantly outperforms state-of-the-art methods across multiple key metrics, including AUC, AUPRC, and F1-score, while exhibiting exceptional generalization robustness under various stringent cold-start scenarios. In-depth ablation studies further confirm the effectiveness and necessity of each proposed innovative module.
Furthermore, multi-scale interpretability analysis and zero-shot cross-domain transfer experiments reveal that the model successfully captures molecular interaction rules with clear pharmacological significance, demonstrating immense practical potential for discovering novel combination therapies through large-scale virtual screening. Our work not only delivers a superior model for drug synergy prediction, but more importantly, it establishes a novel and scalable paradigm for effectively integrating hierarchical molecular structures and topological information into GNNs.
Han, S.; Sztanka-Toth, T.; Senel, E.; Elnaggar, A.; Patel, J.; Mansi, T.; Smirnov, D.; Greshock, J.; Javidi, A.
Single-cell foundation models enable reusable representations and streamlined analysis workflows, yet rigorous evaluation of their performance and robustness in real-world pharmaceutical settings remains underexplored. Here, we benchmarked leading single-cell foundation models (scGPT; scGPT_CP, a continually pretrained checkpoint of scGPT; scFoundation; scMulan; CellFM) against established baseline methods (scVI; Harmony) for data integration using over 1.5 million cells from clinical and preclinical samples. Performance was assessed using well-established and complementary metrics for technical correction and biological structure preservation. We further introduced robustness-oriented rankings to summarize metric trade-offs and quantify performance consistency across datasets and evaluation settings. Our findings show that fine-tuning improved technical correction performance; among the foundation models, fine-tuned scGPT_CP performed best. However, the baseline scVI was the top overall performer, ranking first by our multi-metric Leximax ranking and achieving the highest Pareto Front-1 hit. Collectively, our study provides practical insights for adapting foundation models to real-world drug design and development.
Hirota, K.; Higashi, K.; Kurokawa, K.; Yamada, T.
Recent advances in language models for natural language processing have spread to the field of genomics, driving the development of genome language models (gLMs) to decipher genomic information. Cutting-edge long-context gLMs are promising approaches for understanding and designing biological complexity, but their evaluation remains underdeveloped. In this study, we introduce BGCs-Bench, a unified benchmark focused on biosynthetic gene clusters for assessing long-range genomic modeling on three downstream tasks: biosynthetic class prediction, taxonomic classification, and coding sequence annotation. Using BGCs-Bench, we perform systematic and layer-wise evaluations of the embedding representations of long-context gLMs, demonstrating that layer selection is crucial for downstream task performance. In addition to the evaluation results, logit lens analysis of autoregressive gLMs suggests that StripedHyena-based models use earlier layers to encode biologically meaningful information from input DNA sequences and deeper layers to optimize embeddings for sequence generation. These findings provide insights for more effective development and application of long-context gLMs.
Wang, T.; Liao, S.; Qi, Y.; Zhang, Z.
Liquid-liquid phase separation (LLPS) underlies the formation of biomolecular liquid condensates (also referred to as membraneless organelles, MLOs), which are essential for spatially organizing various biochemical processes within cells. Proteins that play a key role in driving condensate formation are termed phase-separating proteins (PSPs). Given that experimental identification of PSPs remains labor-intensive and time-consuming, multiple computational tools have been developed based on empirical features or deep learning. In this study, we propose SSPSPredictor, a novel multimodal predictive model for PSPs with folded or intrinsically disordered structures, leveraging the fusion of sequence information from the protein language model ESM-2 and structural insights from the graph neural network GVP. Compared with existing tools, SSPSPredictor achieves balanced performance in identifying endogenous PSPs, predicting relative LLPS propensities, and recognizing key regions that drive LLPS. Moreover, SSPSPredictor exhibits good interpretability in identifying driving regions along protein sequences, although no relevant supervision was provided during training. Further predictive analysis of the human proteome using SSPSPredictor reveals that the proportion of intrinsically disordered proteins (IDPs) undergoing LLPS is significantly higher than that of folded proteins. In addition, pathogenic variants, especially those located in disordered regions, exhibit higher LLPS propensity than other mutations, uncovering a link between LLPS and diseases at the amino acid level.
Cisterna Garcia, A.; Gonzalez Lopez, A. M.; Vozi, A.; Esteban, M. A.; Egli, A.; Jutzeler, C.; Palma, J.; Sanchez-Ferrer, A.; Botia, J. A.
Antimicrobial resistance (AMR) has a profound impact on animal and human health and is associated with substantial morbidity, mortality and public health costs. There is a clear need to develop novel, effective antibiotic agents, which can overcome the current AMR crisis. Antimicrobial peptides (AMPs) may offer such a solution and have attracted growing attention for their potential to combat AMR. In parallel, the growing availability of peptide sequences in public databases has stimulated the development of numerous machine learning and deep learning tools to predict antimicrobial activity computationally. However, it remains unclear how reliably these tools can be compared, as existing studies often rely on heterogeneous datasets and inconsistent evaluation protocols that may lead to data leakage and inflated performance estimates. This raises a central question: what evaluation criteria and benchmark resources are needed to enable fair, reproducible, and biologically meaningful assessment of AMP prediction tools? We address this question by focusing specifically on antibacterial peptides (ABPs). We first provide an overview of AMP databases relevant to antibacterial activity and compare their content, redundancy, and experimental metadata. We then critically assess existing computational tools for ABP prediction, highlighting key limitations related to dataset construction, affinity to certain sequences, data leakage, and inconsistent performance reporting. Based on these limitations, we propose a reference evaluation framework designed to improve comparability, reproducibility, and practical utility in ABP prediction. Finally, we provide targeted recommendations for AMP databases and future tool development to support more robust progress in the computational discovery of ABPs.
Gronning, A. G. B.; Scheele, C.
Peptides are gaining increasing attention as therapeutic agents. Already, peptide-based therapeutics play a key role in the treatment of diverse diseases, including diabetes, obesity, and other complex disorders, and their clinical relevance is expected to expand further in the coming years. Technological and computational advances have substantially enriched peptidomics, massively increasing the scale and depth of peptide identification. As a result, increasingly large and information-rich datasets are now available for downstream analysis and experimental validation. However, the rapid expansion of peptidomics datasets also leads to a corresponding increase in search space, complicating the efficient identification of peptides relevant to specific biological or clinical questions. To address this challenge, we present PepHammer, a lightweight web-based tool for bioactive peptide matching and identification. PepHammer allows users to input up to 10000 peptides (2-150 amino acids in length) and compare them against extensive databases of peptides with predicted or experimentally validated bioactivities and tissue associations using Hamming distance, Grantham distance, as well as partial or exact matching strategies. Via an example study of human milk peptidomics, we demonstrate that PepHammer rapidly provides an overview of the bioactivity and tissue-relational landscape, serving as a starting point for downstream analyses. PepHammer thus enables efficient exploration of large-scale peptidomics datasets and facilitates the identification of biologically relevant peptides.
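The Hamming-distance matching strategy mentioned above can be sketched in a few lines. This is an illustrative toy, not PepHammer's code: the function names, the `max_distance` parameter, the rule of skipping database entries of a different length, and the example peptides are all assumptions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def match_peptide(query, database, max_distance=1):
    """Return (reference, distance) pairs within `max_distance` substitutions.

    Assumption: references of a different length are skipped, because
    Hamming distance is undefined for them.
    """
    hits = []
    for ref in database:
        if len(ref) != len(query):
            continue
        d = hamming(query, ref)
        if d <= max_distance:
            hits.append((ref, d))
    # Closest matches first; ties keep database order (sorted is stable).
    return sorted(hits, key=lambda hit: hit[1])


# Toy database of made-up peptides (hypothetical sequences).
toy_db = ["GLFDIVKK", "GLFDIIKK", "GLWDIVKR", "AAAA"]
hits = match_peptide("GLFDIVKK", toy_db, max_distance=1)
print(hits)
```

Hamming distance counts every substitution equally; a Grantham-style distance would instead weight each substitution by the physicochemical difference between the two amino acids.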
Pham, H. T.; Huynh, B.; Nguyen-Vo, T.-H.
Antimicrobial peptides (AMPs) are promising therapeutic candidates against rising antimicrobial resistance, yet progress in AMP prediction is hampered by the lack of benchmark datasets that address homology leakage, negative set reliability, and distributional diversity. Existing AMP databases, designed as biological repositories, do not enforce the controlled partitioning required for rigorous machine learning evaluation. We present GenPept-Curated-2025, a curated, class-balanced benchmark of 11,000 peptide sequences (5,500 AMP / 5,500 non-AMP) derived from Bacteria, Archaea, and Fungi, and sourced exclusively from GenPept/NCBI Protein. The dataset was constructed through a reproducible pipeline comprising taxonomic scoping, quality control, precursor handling, annotation-based labeling, and Identical Protein Groups (IPG)-based deduplication, with sequence length restricted to 10-200 aa. The AMP proportion varies substantially across length bins (14.2% in [10, 50] aa to 77.1% in [101, 150] aa), identifying length-dependent class imbalance as a distribution shift that benchmarking must account for. The dataset is openly released to support standardized, reproducible, and leakage-free evaluation of AMP prediction models.
Jiang, Z.; Nguyen, C. H.; Mamitsuka, H.
Motivation: Accurate prediction of functional sites from primary sequences is essential for elucidating biological mechanisms and advancing rational drug design. However, traditional sequence-based features are inherently unable to capture complex structural protein contexts. Recently, AlphaFold2 (AF2) revolutionized protein structure prediction, raising expectations that AF2 can serve as a feature extractor providing structure-rich representations, which can be useful for sequence-based prediction, particularly for unknown sequences. Results: We present a novel feature-engineering paradigm that leverages a high-dimensional latent representation matrix (of size L x D, where L is the sequence length and D is the feature dimension) extracted directly from the AF2 Evoformer module. We systematically evaluated the AF2 representation against conventional sequence-based features, such as hidden Markov model profiles, using a variety of machine learning models on two structurally contrasting tasks: calpain cleavage site prediction and nucleic-acid-binding site prediction. The AF2 representation clearly and consistently outperformed conventional sequence-based features, particularly for targets with low sequence homology to the training data. Furthermore, interpretability analyses using SHapley Additive exPlanations (SHAP) and Uniform Manifold Approximation and Projection (UMAP) revealed more detail behind the performance advantage of the AF2 representation through feature importance ranking and visualization. Overall, these empirical results confirmed that the AF2 representation could effectively bridge the sequence-to-structure gap as a feature input for sequence prediction, without adding a heavy computational burden. Availability and implementation: Source code, pre-trained models, and datasets are freely available to non-commercial users at https://github.com/Lili-irtyd/Improve-biological-sequences-prediction-by-AlphaFold2. Contact: mami@kuicr.kyoto-u.ac.jp
Xiao, Y.; Zheng, Y.; Hua, Y.; Peng, J.; Liu, J.; Qu, Y.; Xu, J.; Fu, R.; Qian, Q.; Zhao, M.; Zhang, X.; Zhao, J.; Yao, Y.; Kosar, M.; Ke, Y.; Chi, Y.
High-throughput, accurate protein-protein interaction (PPI) prediction is foundational to systems-level biological understanding, disease mechanism dissection, and structure-based drug discovery. Traditional graph convolutional networks (GCNs) are limited by discrete information propagation, layer-wise representation homogenization, and the absence of continuous-time state evolution, and thus fail to capture residues' 3D spatial, hierarchical, dynamic binding patterns. We present LNGCN, a hybrid framework integrating liquid neural networks with GCNs, which encodes residue radial distances as node-level driving terms for continuous updates with hierarchical probabilistic calibration. On standard benchmarks, LNGCN achieves a 90% relative AUPRC improvement over PIPR, outperforms RF2-PPI on 1:10 imbalanced datasets, and retains 0.9324 AUPRC on held-out yeast test data. LNGCN further demonstrates biological utility in phosphorylation-dependent SHP2 signaling, FGF23-FGFR1c-Klotho ternary assembly, Tdk1 oligomeric-state-dependent interactions, and experimentally validated TPR-mediated candidates. By capturing state-dependent interaction changes, LNGCN provides a scalable framework for PPI screening, candidate prioritization, and future residue-level dynamic PPI trajectory modeling.
Shen, L.; Sun, X.; Zheng, S.; Hashmi, A.; Eriksson, J.; Mustonen, H.; Seppänen, H.; Shen, B.; Li, M.; Vähä-Koskela, M.; Tang, J.
Intratumoral heterogeneity drives variable drug responses in cancer. Single-cell RNA sequencing (scRNA-seq) enables characterization of such heterogeneity and prediction of drug response at single-cell resolution. Accordingly, various computational models have been developed to infer drug response from scRNA-seq data. However, their performance, robustness, and generalizability across different biological contexts remain insufficiently evaluated. To address this gap, we benchmarked representative single-cell drug response prediction models using 26 curated datasets comprising over 760,000 cells across 12 cancer types and 21 therapeutic agents. We constructed balanced and imbalanced scenarios to reflect realistic drug-response label distributions. To address the lack of ground-truth labels in conventional scRNA-seq datasets, we incorporated lineage-tracing data with experimentally validated drug-response annotations, enabling evaluation in a clinically relevant pre-treatment prediction setting. Our results show that prediction performance was markedly higher in cell lines than in tissue samples. Under imbalanced conditions, most methods exhibited sharp performance declines, whereas scDEAL demonstrated the highest robustness. Independent validation using an in-house pancreatic ductal adenocarcinoma dataset further confirmed scDEAL's robustness and ability to capture biologically meaningful state transitions. A label-substitution experiment revealed that this robustness was partially driven by the model's specific training-label construction. However, benchmarking with lineage-tracing data revealed a fundamental limitation: most models capture drug-induced transcriptional changes but struggle to predict intrinsic resistance before treatment.
In summary, our study defines the performance boundaries of current approaches and highlights their limitations in addressing intratumoral heterogeneity, class imbalance, and intrinsic resistance prediction, emphasizing the need for next-generation single-cell drug response models with stronger clinical relevance.
Duan, H.; Han, X.; Mo, Y.; Ren, B.; Xia, L. C.
Motivation: Metagenomic sequencing generates petabyte-scale sequence datasets that strain both deep learning and alignment-based enzyme annotation tools. A lightweight, rapid, and accurate filtering tool is needed to identify enzymatic sequences prior to resource-intensive functional prediction. Results: We present sxRaep (Rapid and Accurate Enzyme Predictor), a resource-efficient framework using lightweight physicochemical features for enzyme pre-screening. sxRaep achieves a 6,604-fold speedup over Diamond (0.002 seconds per inference) with a 62.1% memory reduction relative to Diamond (372 MB peak), while maintaining 99.4% accuracy and the highest recall in remote homology detection. This lightweight approach identifies enzymatic candidates missed by alignment-based methods without sacrificing accuracy. Availability and implementation: sxRaep is available as a Python package at https://pypi.org/project/raep/, is maintained as an open-source software repository at https://github.com/labxscut/sxRaep, and can be deployed using the Docker image cirinmok/raep:python3.11 (https://hub.docker.com/r/cirinmok/raep/tags), which provides a reproducible Python 3.11 environment for enzyme prediction and model execution. Contact: lcxia@scut.edu.cn
Zhang, X.
Large language model (LLM) agents are increasingly used for biological data analysis, but prior benchmark results have given a mixed picture of whether they are ready for routine bioinformatics work. The original BixBench study reported only ~17-21% accuracy for frontier agents on open-answer bioinformatics questions [1]. Subsequent curation of BixBench-Verified-50 removed or revised ambiguous items, revealing much higher performance for modern agents [2]. Here we evaluate three frontier-model configurations on the 50 verified questions using the same local benchmark, prompt structure, answer format, and grading pipeline: GPT-5.4 with Claude Scientific Skills and no web access, Claude Opus 4.7 with Claude Scientific Skills and no web access, and GPT-5.5 with Claude Scientific Skills, bioSkills, and web access. The three configurations achieve 88.0% (44/50), 84.0% (42/50), and 98.0% (49/50) accuracy, respectively. The remaining GPT-5.5 error is not a clear analytical failure: the agent correctly computed Spearman correlations on the distributed CRISPRGeneEffect.csv values and selected CCND1, whereas the reference answer is recovered only after interpreting stronger essentiality as the opposite sign of the raw gene-effect score. Offline errors mainly occurred when agents lacked pathway, organism-annotation, BUSCO, or PhyKIT-related resources. These results show that frontier agents equipped with high-quality scientific skills can nearly saturate a curated bioinformatics benchmark, while also emphasizing that question wording, score sign conventions, and access to current external resources remain decisive for reliable evaluation.
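The sign-convention issue described above is easy to reproduce: a Spearman correlation computed against a raw score flips sign when that score is negated to express "essentiality". The sketch below uses a small pure-Python Spearman implementation and made-up numbers. The variable names and values are hypothetical and are not taken from CRISPRGeneEffect.csv or from the benchmark.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five samples.
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_effect = [-0.1, -0.3, -0.2, -0.8, -1.0]   # raw score
essentiality = [-v for v in gene_effect]       # opposite-sign interpretation

rho_raw = spearman(expression, gene_effect)
rho_flipped = spearman(expression, essentiality)
print(rho_raw, rho_flipped)
```

The magnitude of the correlation is identical in both cases, so a "most strongly correlated" question can be answered differently depending on whether the grader expects the raw or the sign-flipped score.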